Providing Fault Tolerance in Grid Computing Systems
Abstract
In grid computing, resources are used across organizational boundaries, so it becomes increasingly difficult to guarantee that the resources being used are not malicious. Resources may also enter and leave the grid at any time. Fault tolerance is therefore a crucial issue in grid computing: it can improve grid throughput, utilization, and response time, and can yield greater economic profit. The mechanisms proposed for fault tolerance in grids fall into two classes: job replication and job checkpointing. Which technique is used depends on the requirements of the computational grid and on the type of environment, resources, and virtual organizations it is meant to work with. Each has its own advantages and disadvantages, which form the subject matter of this paper.
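The two classes named in the abstract can be illustrated with a minimal simulation. This is only a sketch under stated assumptions, not the paper's method: `ResourceFailure`, `run_with_checkpointing`, and `run_with_replication` are hypothetical names, the "resources" are plain Python callables, and failures are injected by hand rather than coming from real grid middleware.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class ResourceFailure(Exception):
    """Hypothetical error: a simulated grid resource failed or left the grid."""


def run_with_checkpointing(steps, fail_at, checkpoint_every=2):
    """Job checkpointing: after a failure, resume from the last saved state.

    The job adds up the step indices 0..steps-1. `fail_at` is a set of step
    indices at which the simulated resource fails once.
    Returns (result, executed), where `executed` counts work units actually run.
    """
    pending_failures = set(fail_at)
    checkpoint = (0, 0)  # (next step index, accumulated result)
    executed = 0
    while True:
        step, total = checkpoint  # restart from the saved state
        try:
            while step < steps:
                if step in pending_failures:
                    pending_failures.discard(step)  # each failure fires once
                    raise ResourceFailure(f"resource lost at step {step}")
                total += step  # the "work" for this step
                executed += 1
                step += 1
                if step % checkpoint_every == 0:
                    checkpoint = (step, total)  # save progress
            return total, executed
        except ResourceFailure:
            continue  # work done since the last checkpoint is repeated


def run_with_replication(job, resources):
    """Job replication: run the same job on every resource concurrently.

    Returns the first successful result and raises ResourceFailure only if
    every replica fails. Leaving the `with` block waits for the remaining
    replicas to finish, which is acceptable for short simulated jobs.
    """
    failures = 0
    with ThreadPoolExecutor(max_workers=len(resources)) as pool:
        futures = [pool.submit(resource, job) for resource in resources]
        for future in as_completed(futures):
            try:
                return future.result()
            except ResourceFailure:
                failures += 1
    raise ResourceFailure(f"all {failures} replicas failed")
```

The sketch makes the trade-off visible: checkpointing uses one resource but repeats whatever was done since the last checkpoint, while replication repeats nothing but occupies several resources for the same job.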
Similar resources
Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid
Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...
Fault Tolerance Improvement techniques in Grid Computing
Grid computing is a distributed computing model that provides access to the geographically distributed heterogeneous resources. These computational grid systems are highly unreliable in nature because of which they need fault tolerance to be an integral part of the system to increase reliability. Commonly utilized techniques for providing fault tolerance are discussed. This paper provides state...
Fault Tolerance within a Grid Environment
Fault tolerance is an important property in Grid computing because the dependability of individual Grid resources cannot always be guaranteed; also, as resources are used outside of organizational boundaries, it becomes increasingly difficult to guarantee that a resource being used is not malicious in some way. As part of the e-Demand project at the University of Durham we are seeking to develop...
Grid Computing and Fault Tolerance Approach
Grid computing is a means of allocating the computational power of a large number of computers to a complex, difficult computation or problem. Grid computing is a distributed computing paradigm that differs from traditional distributed computing in that it is aimed toward large-scale systems that even span organizational boundaries. This paper proposes a method to achieve maximum fault tolerance i...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes, in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The algorithm-based fault tolerance techniques that, to detect errors, rely on the c...